{"cells":[{"attachments":{},"cell_type":"markdown","metadata":{"id":"11n5gndbRzoY"},"source":["# Array-Oriented Programming with NumPy\n"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"AIFsv_RZ1iV0"},"source":["\n"," \n"," \n","
\n"," \"Open\n"," \n"," \n","
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["## Introduction"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"wJjIP5Vne7cM"},"source":["The `NumPy` (Numerical Python) library first appeared in 2006 and is the preferred Python array implementation. It offers a high-performance, richly functional n-dimensional array type called `array.` Operations on arrays are up to one or two orders of magnitude faster than those on `lists`. \n","\n","In this chapter, we explore the array's basic capabilities. The built-in lists can have multiple dimensions, and you generally process multi-dimensional lists with nested loops or list comprehensions with multiple clauses. A strength of `NumPy` is \"array-oriented programming,\" **which uses functional-style programming with internal iteration to make array manipulations concise and straightforward**, eliminating the kinds of bugs that can occur with explicitly programmed loops. "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["In `Python` the types are dynamically inferred and we do not have to allocate the memory by ourselves. This type of flexibility also points to the fact that `Python` variables are more than just their values; they also contain extra information about the type and the size of the value:"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["
\n","
source: https://jakevdp.github.io/PythonDataScienceHandbook/figures/cint_vs_pyint.png
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Similarly, the `list` in `Python` is very flexible that can store heterogeneous objects. But this flexibility comes at a cost: to allow these flexible types, each item in the list must contain its type, size, and other information. Every element is a complete `Python` object. In the special case that all variables are of the same type, **much of this information is redundant**, so storing the data in a fixed-type array can be much more efficient. The difference between a dynamic-type list and a fixed-type (`NumPy`-style) array is illustrated:"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["
\n","
source: https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["At the implementation level, the `array` essentially contains a single pointer to one contiguous block of data. The `Python` `list`, on the other hand, includes a pointer to a block of pointers, each of which in turn points to a whole `Python` object like the `Python` integer we saw earlier. \n","\n","> The advantage of the `list` is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type. Fixed-type `NumPy`-style arrays lack this flexibility but are much more efficient for storing and manipulating data."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["From the previous lecture, we know that every object consists of data and methods. The `ndarray` object of the `NumPy` package not only provides efficient storage of array-based data but adds to this **efficient operations** on that data. "]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"SwFKFBMwRzoa"},"source":["## Creating `array` from Existing Data (Constructor)"]},{"cell_type":"code","execution_count":1,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["numpy is already installed.\n"]}],"source":["package_name = \"numpy\"\n","\n","try:\n"," __import__(package_name)\n"," print(f\"{package_name} is already installed.\")\n","except ImportError:\n"," print(f\"{package_name} not found. 
Installing...\")\n"," %pip install {package_name}"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"xtOzCNTJcj3N"},"source":["The `NumPy` documentation recommends importing the `numpy` module as `np` so that you can access its members with \"`np.`\"."]},{"cell_type":"code","execution_count":2,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":4,"status":"ok","timestamp":1668686223113,"user":{"displayName":"phonchi chung","userId":"13517391734500420886"},"user_tz":-480},"id":"ossL66xxcoGg","outputId":"6fac1a14-1607-409b-c6f1-0b05188f6e2c"},"outputs":[],"source":["import numpy as np"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Creating `array` from a fixed sequence"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"Ss9hRGHDRzob"},"source":["The `numpy` module provides various functions for creating arrays. Here we use the `array()` function, which receives a collection of elements and returns a new array containing the argument's elements. Let’s pass a `list`, for example: "]},{"cell_type":"code","execution_count":3,"metadata":{},"outputs":[{"data":{"text/plain":["(array([ 2, 3, 5, 7, 11]), numpy.ndarray)"]},"execution_count":3,"metadata":{},"output_type":"execute_result"}],"source":["numbers = np.array([2, 3, 5, 7, 11])\n","numbers, type(numbers)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The `array()` function copies its argument's contents into the `array`. Note that the type is `numpy.ndarray`, but all arrays are displayed as \"array\"."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Multidimensional Arguments"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The `array()` function copies its argument's dimensions. 
Let's create an `array` from a two-row-by-three-column `list`:"]},{"cell_type":"code","execution_count":4,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[1, 2, 3],\n"," [4, 5, 6]]),\n"," numpy.ndarray)"]},"execution_count":4,"metadata":{},"output_type":"execute_result"}],"source":["np.array([[1, 2, 3], [4, 5, 6]]), type(np.array([[1, 2, 3], [4, 5, 6]]))"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### `array` Attributes "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The `array` function determines an array's element type from its argument's elements. You can check the element type with an array's `dtype` attribute:"]},{"cell_type":"code","execution_count":5,"metadata":{},"outputs":[{"data":{"text/plain":["(dtype('int32'), dtype('float64'))"]},"execution_count":5,"metadata":{},"output_type":"execute_result"}],"source":["integers = np.array([[1, 2, 3], [4, 5, 6]])\n","floats = np.array([0.0, 0.1, 0.2, 0.3, 0.4])\n","\n","integers.dtype, floats.dtype"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["As you’ll see in the next section, various array-creation functions receive a `dtype` keyword argument so you can specify an array’s element type. \n","\n","For performance reasons, `NumPy` is written in the C programming language and uses C's data types. By default, `NumPy` stores integers as values of the `NumPy` type `int_`, which is a 32-bit (4-byte) integer on the platform used here (the width is platform-dependent and is 64 bits on many other systems), and stores floating-point numbers as values of the `NumPy` type `float64`, which corresponds to a 64-bit (8-byte) floating-point value (`double`) in C. In our examples, you'll most commonly see the types `int32`, `float64`, `bool` (for Boolean values) and `object` for non-numeric data (such as strings). The complete list of supported types is at [https://docs.scipy.org/doc/numpy/user/basics.types.html](https://docs.scipy.org/doc/numpy/user/basics.types.html). 
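\n","\n","
The inferred and explicitly requested element types described above can be checked with a short sketch. The variable names are illustrative and not part of the original notebook, and the inferred integer width is printed rather than assumed because it depends on the platform and NumPy version.

```python
import numpy as np

# Element type is inferred from the data unless dtype is given explicitly.
inferred = np.array([1, 2, 3])
explicit = np.array([1, 2, 3], dtype=np.int64)
as_float = np.array([1, 2, 3], dtype=np.float64)

# The inferred integer width varies by platform, so it is only printed.
print(inferred.dtype, inferred.itemsize)
print(explicit.dtype, explicit.itemsize)   # int64 8
print(as_float.dtype, as_float.itemsize)   # float64 8
```

Requesting the `dtype` explicitly is one way to keep results the same across machines when the integer width matters.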
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The attribute `ndim` contains an array's number of dimensions and the attribute `shape` contains a tuple specifying an array's dimensions: "]},{"cell_type":"code","execution_count":6,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["2\n","1\n"]}],"source":["print(integers.ndim)\n","print(floats.ndim)"]},{"cell_type":"code","execution_count":7,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["(2, 3)\n","(5,)\n"]}],"source":["print(integers.shape)\n","print(floats.shape)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Here, integers have 2 rows and 3 columns (6 elements) and floats are one-dimensional, containing 5 floating numbers."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["You can view an array’s total number of elements with the attribute `size` and the number of bytes required to store each element with `itemsize`:"]},{"cell_type":"code","execution_count":8,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["6\n","4\n","5\n","8\n"]}],"source":["print(integers.size)\n","print(integers.itemsize)\n","print(floats.size)\n","print(floats.itemsize)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Note that the integers' size is the product of the shape tuple's values — two rows of three elements each for a total of six elements. In each case, `itemsize` is 4 because integers contain `int32` values and 8 since floats contain `float64` values."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Filling `array` with Specific Values"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["`NumPy` provides functions `zeros()`, `ones()` and `full()` for creating arrays containing 0s, 1s or a specified value, respectively. By default, `zeros()` and `ones()` create arrays containing `float64` values. 
We’ll show how to customize the element type momentarily. The first argument to these functions must be an integer or a tuple of integers specifying the desired dimensions. For an integer, each function returns a one-dimensional array with the specified number of elements:"]},{"cell_type":"code","execution_count":9,"metadata":{},"outputs":[{"data":{"text/plain":["array([0., 0., 0., 0., 0.])"]},"execution_count":9,"metadata":{},"output_type":"execute_result"}],"source":["np.zeros(5)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["For a tuple of integers, these functions return a multidimensional array with the specified dimensions. You can specify the array's element type with the `dtype` keyword argument of `zeros()` and `ones()`:"]},{"cell_type":"code","execution_count":10,"metadata":{},"outputs":[{"data":{"text/plain":["array([[1, 1, 1, 1],\n"," [1, 1, 1, 1]], dtype=int64)"]},"execution_count":10,"metadata":{},"output_type":"execute_result"}],"source":["np.ones((2, 4), dtype=np.int64)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The array returned by `full()` contains elements with the second argument's value and type: "]},{"cell_type":"code","execution_count":11,"metadata":{},"outputs":[{"data":{"text/plain":["array([[13, 13, 13, 13, 13],\n"," [13, 13, 13, 13, 13],\n"," [13, 13, 13, 13, 13]])"]},"execution_count":11,"metadata":{},"output_type":"execute_result"}],"source":["np.full((3, 5), 13)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Creating `array` from sequences generated by different methods"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Creating a sequence with a fixed step by `arange()` "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's use `NumPy`'s `arange` function to create integer ranges, similar to using the built-in function `range`. 
In each case, `arange` first determines the resulting array’s number of elements, allocates the memory, then stores the specified range of values in the array: "]},{"cell_type":"code","execution_count":12,"metadata":{},"outputs":[{"data":{"text/plain":["array([0, 1, 2, 3, 4])"]},"execution_count":12,"metadata":{},"output_type":"execute_result"}],"source":["np.arange(5)"]},{"cell_type":"code","execution_count":13,"metadata":{},"outputs":[{"data":{"text/plain":["array([5, 6, 7, 8, 9])"]},"execution_count":13,"metadata":{},"output_type":"execute_result"}],"source":["np.arange(5, 10)"]},{"cell_type":"code","execution_count":14,"metadata":{},"outputs":[{"data":{"text/plain":["array([10, 8, 6, 4, 2])"]},"execution_count":14,"metadata":{},"output_type":"execute_result"}],"source":["np.arange(10, 1, -2) "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> Like the built-in `range()`, `arange()` accepts up to three arguments, `numpy.arange(start, stop, step)`, and excludes the `stop` value."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Creating a sequence with a fixed number of samples by `linspace()`"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["You can produce evenly spaced floating-point ranges with `NumPy`'s `linspace()` function. The function’s first two arguments specify the starting and ending values in the range, **and the ending value is included in the array**. The optional keyword argument `num` specifies the number of evenly spaced values to produce:"]},{"cell_type":"code","execution_count":15,"metadata":{},"outputs":[{"data":{"text/plain":["array([0. , 0.25, 0.5 , 0.75, 1. 
])"]},"execution_count":15,"metadata":{},"output_type":"execute_result"}],"source":["np.linspace(0.0, 1.0, num=5)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Reshaping an `array` "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["You also can create an `array` from a range of elements, then use the array method `reshape()` to transform the one-dimensional array into a multidimensional array. Let's create an `array` containing the values from 1 through 20, then reshape it into four rows by five columns:"]},{"cell_type":"code","execution_count":16,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 1, 2, 3, 4, 5],\n"," [ 6, 7, 8, 9, 10],\n"," [11, 12, 13, 14, 15],\n"," [16, 17, 18, 19, 20]])"]},"execution_count":16,"metadata":{},"output_type":"execute_result"}],"source":["np.arange(1, 21).reshape(4, 5)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Note the ***chained method*** calls in the preceding snippet. First, `arange` produces an array containing the values 1–20. Then we call `reshape()` on that array to get the 4-by-5 array that was displayed. You can `reshape()` any array, provided that the new shape has the same number of elements as the original. So a six-element one-dimensional array can become a 3-by-2 or 2-by-3 array, and vice versa! "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### `List` vs. `array` Performance: Introducing `%timeit` "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Most `array` operations execute significantly faster than corresponding `list` operations. To demonstrate, we’ll use the `%timeit` magic command, which times the average duration of operations. 
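\n","\n","
The `%timeit` magic is available only inside IPython and Jupyter. As a rough stand-in for a plain Python script, the sketch below uses the standard-library `timeit` module to compare summing a list with summing an array. The variable names, sizes and repetition counts are illustrative choices, not part of the original notebook, and the measured times will differ from machine to machine.

```python
import timeit

import numpy as np

values_list = list(range(10_000))
values_array = np.arange(10_000)

# timeit.repeat calls each function `number` times per repetition;
# taking the minimum reduces noise from other processes.
list_time = min(timeit.repeat(lambda: sum(values_list), number=100, repeat=3))
array_time = min(timeit.repeat(lambda: values_array.sum(), number=100, repeat=3))

print(f'list sum:  {list_time:.4f} s for 100 runs')
print(f'array sum: {array_time:.4f} s for 100 runs')
```

No expected timings are given here because they depend on the hardware; only the relative difference is of interest.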
"]},{"cell_type":"code","execution_count":17,"metadata":{},"outputs":[],"source":["import random"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Here, let’s use the `random` module’s `randint()` function with a list comprehension to create a list of six million die rolls and time the operation using `%timeit`:"]},{"cell_type":"code","execution_count":18,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["3.67 s ± 15.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"]}],"source":["%timeit rolls_list = [random.randint(1, 6) for i in range(0, 6_000_000)] #_ is use to separate long integer"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> By default, `%timeit` executes a statement in a loop, and it runs the loop seven times. If you do not indicate the number of loops, `%timeit` chooses an appropriate value."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Now, let's use the `randint()` function from the `numpy.random` module to create an array"]},{"cell_type":"code","execution_count":19,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["42 ms ± 805 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"]}],"source":["%timeit rolls_array = np.random.randint(1, 7, 6_000_000)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["## Indexing and Slicing (Getter and Setter)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["One-dimensional arrays can be indexed and sliced using the same syntax and techniques demonstrated in the \"Lists and Tuples\" chapter. Here, we focus on array-specific indexing and slicing capabilities. 
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["To select an element in a two-dimensional array, specify a tuple containing the element's row and column indices in square brackets:"]},{"cell_type":"code","execution_count":20,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]])"]},"execution_count":20,"metadata":{},"output_type":"execute_result"}],"source":["grades = np.array([[87, 96, 70], [100, 87, 90],\n"," [94, 77, 90], [100, 81, 82]])\n","grades"]},{"cell_type":"code","execution_count":21,"metadata":{},"outputs":[{"data":{"text/plain":["96"]},"execution_count":21,"metadata":{},"output_type":"execute_result"}],"source":["grades[0, 1] # row 0, column 1"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["To select a single row, specify only one index in square brackets:"]},{"cell_type":"code","execution_count":22,"metadata":{},"outputs":[{"data":{"text/plain":["array([100, 87, 90])"]},"execution_count":22,"metadata":{},"output_type":"execute_result"}],"source":["grades[1]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["To select multiple sequential rows, use slice notation:"]},{"cell_type":"code","execution_count":23,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":23,"metadata":{},"output_type":"execute_result"}],"source":["grades[0:2]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["To select multiple non-sequential rows, use a list of row indices (fancy indexing):"]},{"cell_type":"code","execution_count":24,"metadata":{},"outputs":[{"data":{"text/plain":["array([[100, 87, 90],\n"," [100, 81, 82]])"]},"execution_count":24,"metadata":{},"output_type":"execute_result"}],"source":["grades[[1, 3]]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's select only the elements in the first column: 
"]},{"cell_type":"code","execution_count":25,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 87, 100, 94, 100])"]},"execution_count":25,"metadata":{},"output_type":"execute_result"}],"source":["grades[:, 0]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The 0 after the comma indicates that we're selecting only column 0. The `:` before the comma indicates which rows within that column to select. **In this case, `:` is a slice representing all rows**. You can select consecutive columns using a slice:"]},{"cell_type":"code","execution_count":26,"metadata":{},"outputs":[{"data":{"text/plain":["array([[96, 70],\n"," [87, 90],\n"," [77, 90],\n"," [81, 82]])"]},"execution_count":26,"metadata":{},"output_type":"execute_result"}],"source":["grades[:, 1:3]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["or specific columns using a list of column indices:"]},{"cell_type":"code","execution_count":27,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 70],\n"," [100, 90],\n"," [ 94, 90],\n"," [100, 82]])"]},"execution_count":27,"metadata":{},"output_type":"execute_result"}],"source":["grades[:, [0, 2]]"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["`array` is mutable. Therefore, if we want to modify the value of the array, we can use the previous method and put the result on the left-hand side: "]},{"cell_type":"code","execution_count":28,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 42]])"]},"execution_count":28,"metadata":{},"output_type":"execute_result"}],"source":["grades[3, 2] = 42\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Views: Shallow Copies"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["***Views*** are objects \"see\" the data in other objects, rather than having their own copies of the data. Views are also known as ***shallow copies***. 
Various `array` methods and slicing operations produce views of an array's data. The `array` method `view()` returns a new array object with a view of the original array object's data. First, let’s create an array and a view of that array:"]},{"cell_type":"code","execution_count":30,"metadata":{},"outputs":[],"source":["numbers = np.arange(1, 6)\n","numbers2 = numbers.view()"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can use the built-in `id()` function to see that `numbers` and `numbers2` are different objects:"]},{"cell_type":"code","execution_count":31,"metadata":{},"outputs":[{"data":{"text/plain":["(1664107572080, 1664073164144)"]},"execution_count":31,"metadata":{},"output_type":"execute_result"}],"source":["id(numbers), id(numbers2)"]},{"cell_type":"code","execution_count":32,"metadata":{},"outputs":[{"data":{"text/plain":["True"]},"execution_count":32,"metadata":{},"output_type":"execute_result"}],"source":["np.shares_memory(numbers, numbers2)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["**To prove that `numbers2` views the same data as `numbers`**, let's modify an element in `numbers`, then display both arrays:"]},{"cell_type":"code","execution_count":33,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 20, 3, 4, 5])"]},"execution_count":33,"metadata":{},"output_type":"execute_result"}],"source":["numbers[1] *= 10\n","numbers"]},{"cell_type":"code","execution_count":34,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 20, 3, 4, 5])"]},"execution_count":34,"metadata":{},"output_type":"execute_result"}],"source":["numbers2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Similarly, changing a value in the view also changes that value in the original array:"]},{"cell_type":"code","execution_count":35,"metadata":{},"outputs":[{"data":{"text/plain":["(array([1, 4, 3, 4, 5]), array([1, 4, 3, 4, 
5]))"]},"execution_count":35,"metadata":{},"output_type":"execute_result"}],"source":["numbers2[1] /= 5\n","numbers, numbers2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Slices also create views. Let’s make `numbers2` a slice that views only the first three elements of numbers:"]},{"cell_type":"code","execution_count":36,"metadata":{},"outputs":[{"data":{"text/plain":["array([1, 4, 3])"]},"execution_count":36,"metadata":{},"output_type":"execute_result"}],"source":["numbers2 = numbers[0:3]\n","numbers2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Again, we can confirm that `numbers` and `numbers2` are different objects with `id()`: "]},{"cell_type":"code","execution_count":38,"metadata":{},"outputs":[{"data":{"text/plain":["(1664107572080, 1664107667152, True)"]},"execution_count":38,"metadata":{},"output_type":"execute_result"}],"source":["id(numbers), id(numbers2), np.shares_memory(numbers, numbers2)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Now, let's modify an element both arrays share, then display them. Again, we see that `numbers2` is a view of `numbers`:"]},{"cell_type":"code","execution_count":39,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 80, 3, 4, 5])"]},"execution_count":39,"metadata":{},"output_type":"execute_result"}],"source":["numbers[1] *= 20\n","numbers"]},{"cell_type":"code","execution_count":40,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 80, 3])"]},"execution_count":40,"metadata":{},"output_type":"execute_result"}],"source":["numbers2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> Note that this behavior is different from `list`, where the slicing will create a new sub `list`! 
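\n","\n","
The difference noted above can be seen side by side in a minimal sketch. The names `values_list`, `values_array`, `list_slice` and `array_slice` are hypothetical and used only for illustration.

```python
import numpy as np

values_list = [1, 2, 3, 4, 5]
values_array = np.array(values_list)

list_slice = values_list[0:3]     # a new list, independent of the original
array_slice = values_array[0:3]   # a view onto the same data buffer

# Modify the originals after slicing.
values_list[1] = 99
values_array[1] = 99

print(list_slice)    # [1, 2, 3]
print(array_slice)   # [ 1 99  3]
print(np.shares_memory(values_array, array_slice))  # True
```

When an independent sub-array is needed, calling `copy()` on the slice gives the list-like behavior.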
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Deep Copies"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Though views are separate `array` objects, they save memory by sharing element data from other arrays. However, when sharing mutable values, sometimes creating a ***deep copy*** with independent copies of the original data is necessary. This is especially important in multi-core programming, where separate parts of your program could attempt to modify your data at the same time, possibly corrupting it. "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The `array` method `copy()` returns a new array object with a deep copy of the original array object's data. First, let's create an array and a deep copy of that array:"]},{"cell_type":"code","execution_count":41,"metadata":{},"outputs":[],"source":["numbers = np.arange(1, 6)\n","numbers2 = numbers.copy()"]},{"cell_type":"code","execution_count":43,"metadata":{},"outputs":[{"data":{"text/plain":["(1664107664944, 1664107665712, False)"]},"execution_count":43,"metadata":{},"output_type":"execute_result"}],"source":["id(numbers), id(numbers2), np.shares_memory(numbers, numbers2)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["To prove that `numbers2` has a separate copy of the data in `numbers`, let’s modify an element in `numbers`, then display both arrays: "]},{"cell_type":"code","execution_count":42,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 20, 3, 4, 5])"]},"execution_count":42,"metadata":{},"output_type":"execute_result"}],"source":["numbers[1] *= 10\n","numbers"]},{"cell_type":"code","execution_count":44,"metadata":{},"outputs":[{"data":{"text/plain":["array([1, 2, 3, 4, 5])"]},"execution_count":44,"metadata":{},"output_type":"execute_result"}],"source":["numbers2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> Recall that if you need deep copies of other types of `Python` objects, 
pass them to the `copy` module’s `deepcopy()` function. "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### More about Reshaping and Transposing "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We've used the `array` method `reshape()` to produce two-dimensional arrays from a one-dimensional `array`. `NumPy` provides various other ways to reshape arrays."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The array methods `reshape()` and `resize()` both enable you to change an array's dimensions. Method `reshape()` returns a view (shallow copy) of the original array with the new dimensions whenever possible. It does not change the original array's shape, but because the view shares the original's data, modifying an element through the view also changes the original:"]},{"cell_type":"code","execution_count":58,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":58,"metadata":{},"output_type":"execute_result"}],"source":["grades = np.array([[87, 96, 70], [100, 87, 90]])\n","grades"]},{"cell_type":"code","execution_count":59,"metadata":{},"outputs":[],"source":["grades2 = grades.reshape(1, 6)"]},{"cell_type":"code","execution_count":60,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[ 0, 96, 70, 100, 87, 90]]),\n"," array([[ 0, 96, 70],\n"," [100, 87, 90]]))"]},"execution_count":60,"metadata":{},"output_type":"execute_result"}],"source":["grades2[0, 0] = 0\n","grades2, grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["A common trick is to use `-1` for one dimension of the shape passed to `reshape()`. 
The length of the dimension set to `-1` is inferred automatically from the array's size and the lengths given for the other dimensions:"]},{"cell_type":"code","execution_count":61,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 96],\n"," [ 70, 100],\n"," [ 87, 90]])"]},"execution_count":61,"metadata":{},"output_type":"execute_result"}],"source":["grades.reshape(-1, 2) # Same as grades.reshape(3, 2)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Method `resize()` modifies the original array's shape in place. It does not return a value:"]},{"cell_type":"code","execution_count":62,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 96, 70, 100, 87, 90]])"]},"execution_count":62,"metadata":{},"output_type":"execute_result"}],"source":["grades.resize(1, 6)\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can also do the opposite operation, taking a multidimensional `array` and flattening it into a single dimension with the methods `flatten()` and `ravel()`. 
Method `flatten()` deep copies the original `array`'s data:"]},{"cell_type":"code","execution_count":63,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":63,"metadata":{},"output_type":"execute_result"}],"source":["grades = np.array([[87, 96, 70], [100, 87, 90]])\n","grades"]},{"cell_type":"code","execution_count":64,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 87, 96, 70, 100, 87, 90])"]},"execution_count":64,"metadata":{},"output_type":"execute_result"}],"source":["flattened = grades.flatten()\n","flattened"]},{"cell_type":"code","execution_count":65,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":65,"metadata":{},"output_type":"execute_result"}],"source":["flattened[0] = 100\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Method `ravel()` produces a view of the original `array`, which shares the `grades` `array`'s data!"]},{"cell_type":"code","execution_count":66,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 87, 96, 70, 100, 87, 90])"]},"execution_count":66,"metadata":{},"output_type":"execute_result"}],"source":["raveled = grades.ravel()\n","raveled"]},{"cell_type":"code","execution_count":67,"metadata":{},"outputs":[{"data":{"text/plain":["array([[100, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":67,"metadata":{},"output_type":"execute_result"}],"source":["raveled[0] = 100\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Additionally, we can effortlessly transpose an `array`'s rows and columns, causing the rows to turn into columns and the columns into rows. The `T` attribute returns a transposed view (shallow copy) of the array. Assume that the original `grades` `array` presents two students' grades (the rows) across three exams (the columns). 
Let's transpose the rows and columns to examine the data as the grades for three exams (the rows) taken by two students (the columns):"]},{"cell_type":"code","execution_count":69,"metadata":{},"outputs":[{"data":{"text/plain":["array([[100, 100],\n"," [ 96, 87],\n"," [ 70, 90]])"]},"execution_count":69,"metadata":{},"output_type":"execute_result"}],"source":["transpose = grades.T\n","transpose"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Transposing does not modify the original array but it does create a view of the original array's data:"]},{"cell_type":"code","execution_count":70,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 96, 70],\n"," [100, 87, 90]])"]},"execution_count":70,"metadata":{},"output_type":"execute_result"}],"source":["transpose[0, 0] = 0\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["You can combine arrays by adding more columns or more rows — known as horizontal stacking and vertical stacking. Let's create another 2-by-3 array of grades:"]},{"cell_type":"code","execution_count":71,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 94, 77, 90],\n"," [100, 81, 82]])"]},"execution_count":71,"metadata":{},"output_type":"execute_result"}],"source":["grades2 = np.array([[94, 77, 90], [100, 81, 82]])\n","grades2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's assume `grades2` represents three additional exam grades for the two students in the `grades` array. We can combine `grades` and `grades2` with `NumPy`'s `hstack()` (horizontal stack) function by passing a `tuple` containing the arrays to combine. 
The extra parentheses are required because `hstack()` expects the arrays to be passed as a single argument:"]},{"cell_type":"code","execution_count":72,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 96, 70, 94, 77, 90],\n"," [100, 87, 90, 100, 81, 82]])"]},"execution_count":72,"metadata":{},"output_type":"execute_result"}],"source":["np.hstack((grades, grades2))"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Next, let's assume that `grades2` represents two more students' grades on three exams. In this case, we can combine `grades` and `grades2` with `NumPy`'s `vstack()` (vertical stack) function: "]},{"cell_type":"code","execution_count":73,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]])"]},"execution_count":73,"metadata":{},"output_type":"execute_result"}],"source":["np.vstack((grades, grades2))"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> ### Exercise 1: Suppose we are developing a chess game that provides two special checkerboards, as follows:\n","\n","
\n","\n","
\n","\n","We use 1 to represent a white square and 0 to represent a black square. Write a program that creates two 2D arrays representing the two checkerboards:\n","\n","```python\n","[[1, 0, 1, 0, 1, 0],\n"," [0, 1, 0, 1, 0, 1],\n"," [1, 0, 1, 0, 1, 0],\n"," [0, 1, 0, 1, 0, 1],\n"," [1, 0, 1, 0, 1, 0],\n"," [0, 1, 0, 1, 0, 1]]\n","```\n","\n","```python\n","[[1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1],\n"," [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0],\n"," [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1]]\n","```\n","\n","Note that you should not hardcode the arrays above; use `NumPy` functions to create them. After you have finished the exercise, you can display the checkerboards using the plotting cell that follows the answer cells."]},{"cell_type":"code","execution_count":77,"metadata":{},"outputs":[{"data":{"text/plain":["array([[1., 0., 1., 0., 1., 0.],\n"," [0., 1., 0., 1., 0., 1.],\n"," [1., 0., 1., 0., 1., 0.],\n"," [0., 1., 0., 1., 0., 1.],\n"," [1., 0., 1., 0., 1., 0.],\n"," [0., 1., 0., 1., 0., 1.]])"]},"execution_count":77,"metadata":{},"output_type":"execute_result"}],"source":["# Your answer here\n","checkerboard"]},{"cell_type":"code","execution_count":79,"metadata":{},"outputs":[{"data":{"text/plain":["array([[1., 0., 1., 0., 1., 0., 0., 1., 0., 1., 0., 1.],\n"," [0., 1., 0., 1., 0., 1., 1., 0., 1., 0., 1., 0.],\n"," [1., 0., 1., 0., 1., 0., 0., 1., 0., 1., 0., 1.]])"]},"execution_count":79,"metadata":{},"output_type":"execute_result"}],"source":["# Your answer here\n","checkerboard2"]},{"cell_type":"code","execution_count":80,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["matplotlib is already 
installed.\n"]},{"data":{"image/png":"iVBORw0KGgoAAAANSUhEUgAAAZgAAAGdCAYAAAAv9mXmAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAUoUlEQVR4nO3df6iW9f348det4rHVue+yzDrzaLWtwkTHLEVa+1GukIjqrwhhzslg4zgSCYb/zPrrCINobNJksfpnYltgQVDNuXkklmSKYMEio9EZptZg9308sLs45/r88aHz+fpNm7f6ui/PuR4PuKFzd9/n/bp8n+5n132dc6wVRVEEAFxg08oeAICpSWAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEgxYxuLzg+Ph5Hjx6N3t7eqNVq3V4egPNQFEWMjIxEX19fTJv2xecoXQ/M0aNHo7+/v9vLAnABDQ8Px7x5877wMV0PTG9vb0T873D1er3by5eq0WiUPULXNZvNskcoRRX3OqKa+13Vvf7stfyLdD0wn70tVq/XKxeYKrLH1WK/q+NsLnG4yA9ACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACnOKTBbt26N6667LmbNmhXLly+PN95440LPBcAk13Fgnnvuudi4cWNs3rw5Dh48GEuWLIl77rknTpw4kTEfAJNUx4F54okn4kc/+lGsXbs2Fi5cGL/5zW/iS1/6Uvzud7/LmA+ASaqjwHzyySdx4MCBWLly5f99gmnTYuXKlfH666+f9jntdjtardYpNwCmvo4C8/HHH8fY2FjMnTv3lPvnzp0bx44dO+1zBgcHo9FoTNz6+/vPfVoAJo307yLbtGlTNJvNidvw8HD2kgBcBGZ08uCrrroqpk+fHsePHz/l/uPHj8c111xz2uf09PRET0/PuU8IwKTU0RnMzJkzY+nSpbF79+6J+8bHx2P37t2xYsWKCz4cAJNXR2cwEREbN26MNWvWxK233hrLli2LJ598MkZHR2Pt2rUZ8wEwSXUcmIceeig++uij+PnPfx7Hjh2Lr3/96/HKK6987sI/ANVWK4qi6OaCrVYrGo1GNJvNqNfr3Vy6dLVarewRuq7LX14XjSrudUQ197uqe302r+F+FxkAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSzChr4UajUdbSpSmKouwRuq5Wq5U9QimquNcR1dzvqu11q9U669dvZzAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFJ0HJi9e/fGfffdF319fVGr1eKFF15IGAuAya7jwIyOjsaSJUti69atGfMAMEXM6PQJq1atilWrVmXMAsAU0nFgOtVut6Pdbk983Gq1spcE4CKQfpF/cHAwGo3GxK2/vz97SQAuAumB2bRpUzSbzYnb8PBw9pIAXATS3yLr6emJnp6e7GUAuMj4ORgAUnR8BnPy5Mk4cuTIxMfvv/9+HDp0KGbPnh3z58+/oMMBMHnViqIoOnnCnj174rvf/e7n7l+zZk08++yz//X5rVYrGo1GJ0tOGR3+UU8JtVqt7BFKUcW9jqjmfldtrz97DW82m1Gv17/wsR2fwXznO9+p3B8oAJ1zDQaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBo
AUAgNACoEBIIXAAJBCYABIITAApBAYAFLMKGvhZrMZ9Xq9rOVLUavVyh6h64qiKHuEUlRxryOqud9V3euz4QwGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEgRUeBGRwcjNtuuy16e3vj6quvjgceeCDeeeedrNkAmMQ6CszQ0FAMDAzEvn37YteuXfHpp5/G3XffHaOjo1nzATBJ1YqiKM71yR999FFcffXVMTQ0FN/61rfO6jmtVisajUY0m82o1+vnuvSkVKvVyh6h687jy2tSq+JeR1Rzv6u612fzGj7jfBeIiJg9e/YZH9Nut6Pdbk983Gq1zmdJACaJc77IPz4+Hhs2bIjbb789Fi1adMbHDQ4ORqPRmLj19/ef65IATCLn/BbZT37yk3j55Zfjtddei3nz5p3xcac7g+nv7/cWWUVU8S2TiGrudUQ197uqe532Ftn69evjpZdeir17935hXCIienp6oqen51yWAWAS6ygwRVHET3/609i5c2fs2bMnrr/++qy5AJjkOgrMwMBAbN++PV588cXo7e2NY8eORUREo9GISy65JGVAACanjq7BnOm9xmeeeSZ+8IMfnNXn8G3K1VLF9+QjqrnXEdXc76ru9QW/BlPFLx4Azo3fRQZACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMAClmlLVwo9Eoa+nSFEVR9ghdV6vVyh6hFFXc64hq7nfV9rrVap3167czGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKToKzFNPPRWLFy+Oer0e9Xo9VqxYES+//HLWbABMYh0FZt68ebFly5Y4cOBAvPnmm3HnnXfG/fffH2+//XbWfABMUrWiKIrz+QSzZ8+OX/ziF7Fu3bqzenyr1YpGo3E+S05a5/lHPSnVarWyRyhFFfc6opr7XbW9/uw1vNlsRr1e/8LHzjjXRcbGxuKPf/xjjI6OxooVK874uHa7He12+5ThAJj6Or7If/jw4bjsssuip6cnfvzjH8fOnTtj4cKFZ3z84OBgNBqNiVt/f/95DQzA5NDxW2SffPJJfPDBB9FsNuP555+Pp59+OoaGhs4YmdOdwVQ1MlU7lY6o5lsmEdXc64hq7nfV9rqTt8jO+xrMypUr4ytf+Ups27ato+GqqGpfiBHVfMGJqOZeR1Rzv6u2150E5rx/DmZ8fPyUMxQAiOjwIv+mTZti1apVMX/+/BgZGYnt27fHnj174tVXX82aD4BJqqPAnDhxIr7//e/Hhx9+GI1GIxYvXhyvvvpqfO9738uaD4BJ6ryvwXTKNZhqqeJ78hHV3OuIau531fa6q9dgAOB0BAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFLMKGvhZrMZ9Xq9rOVLUavVyh6h64qiKHuEUlRxryOqud9V3euz4QwGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0CK8wrMli1bolarxYYNGy7QOABMFeccmP3798e2bdti8eLFF3IeAKaIcwrMyZMnY/Xq1fHb3/42rrjiigs9EwBTwD
kFZmBgIO69995YuXLlf31su92OVqt1yg2AqW9Gp0/YsWNHHDx4MPbv339Wjx8cHIzHH3+848EAmNw6OoMZHh6ORx55JH7/+9/HrFmzzuo5mzZtimazOXEbHh4+p0EBmFxqRVEUZ/vgF154IR588MGYPn36xH1jY2NRq9Vi2rRp0W63T/l3p9NqtaLRaESz2Yx6vX7uk09CtVqt7BG6roMvrymlinsdUc39rupen81reEdvkd11111x+PDhU+5bu3Zt3HzzzfGzn/3sv8YFgOroKDC9vb2xaNGiU+679NJL48orr/zc/QBUm5/kByBFx99F9v/bs2fPBRgDgKnGGQwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKSYUdbCjUajrKVLUxRF2SN0Xa1WK3uEUlRxryOqud9V2+tWq3XWr9/OYABIITAApBAYAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIITAApOgoMI899ljUarVTbjfffHPWbABMYjM6fcItt9wSf/7zn//vE8zo+FMAUAEd12HGjBlxzTXXZMwCwBTS8TWYd999N/r6+uKGG26I1atXxwcffPCFj2+329FqtU65ATD1dRSY5cuXx7PPPhuvvPJKPPXUU/H+++/HHXfcESMjI2d8zuDgYDQajYlbf3//eQ8NwMWvVhRFca5P/ve//x0LFiyIJ554ItatW3fax7Tb7Wi32xMft1qtykbmPP6oJ61arVb2CKWo4l5HVHO/q7bXrVYrGo1GNJvNqNfrX/jY87pCf/nll8eNN94YR44cOeNjenp6oqen53yWAWASOq+fgzl58mS89957ce21116oeQCYIjoKzKOPPhpDQ0Pxj3/8I/72t7/Fgw8+GNOnT4+HH344az4AJqmO3iL75z//GQ8//HD861//ijlz5sQ3v/nN2LdvX8yZMydrPgAmqY4Cs2PHjqw5AJhi/C4yAFIIDAApBAaAFAIDQAqBASCFwACQQmAASCEwAKQQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAUAgNACoEBIIXAAJBCYABIMaPbCxZF0e0lLxqtVqvsEegSe10dVdvrz473bF7Lux6YkZGRbi950Wg0GmWPQJfY6+qo6l6PjIz812OvFV0+pRgfH4+jR49Gb29v1Gq1rq3barWiv78/hoeHo16vd23dsjnu6hx3FY85oprHXeYxF0URIyMj0dfXF9OmffFVlq6fwUybNi3mzZvX7WUn1Ov1ynwR/r8cd3VU8ZgjqnncZR3z2Z61ucgPQAqBASBFZQLT09MTmzdvjp6enrJH6SrHXZ3jruIxR1TzuCfLMXf9Ij8A1VCZMxgAuktgAEghMACkEBgAUlQmMFu3bo3rrrsuZs2aFcuXL4833nij7JFS7d27N+67777o6+uLWq0WL7zwQtkjpRscHIzbbrstent74+qrr44HHngg3nnnnbLHSvfUU0/F4sWLJ37obsWKFfHyyy+XPVZXbdmyJWq1WmzYsKHsUVI99thjUavVTrndfPPNZY91RpUIzHPPPRcbN26MzZs3x8GDB2PJkiVxzz33xIkTJ8oeLc3o6GgsWbIktm7dWvYoXTM0NBQDAwOxb9++2LVrV3z66adx9913x+joaNmjpZo3b15s2bIlDhw4EG+++Wbceeedcf/998fbb79d9mhdsX///ti2bVssXry47FG64pZbbokPP/xw4vbaa6+VPdKZFRWwbNmyYmBgYOLjsbGxoq+vrxgcHCxxqu6JiGLnzp1lj9F1J06cKCKiGBoaKnuUrrviiiuKp59+uuwx0o2MjBRf+9
rXil27dhXf/va3i0ceeaTskVJt3ry5WLJkSdljnLUpfwbzySefxIEDB2LlypUT902bNi1WrlwZr7/+eomTka3ZbEZExOzZs0uepHvGxsZix44dMTo6GitWrCh7nHQDAwNx7733nvLf91T37rvvRl9fX9xwww2xevXq+OCDD8oe6Yy6/ssuu+3jjz+OsbGxmDt37in3z507N/7+97+XNBXZxsfHY8OGDXH77bfHokWLyh4n3eHDh2PFihXxn//8Jy677LLYuXNnLFy4sOyxUu3YsSMOHjwY+/fvL3uUrlm+fHk8++yzcdNNN8WHH34Yjz/+eNxxxx3x1ltvRW9vb9njfc6UDwzVNDAwEG+99dbF/f70BXTTTTfFoUOHotlsxvPPPx9r1qyJoaGhKRuZ4eHheOSRR2LXrl0xa9asssfpmlWrVk388+LFi2P58uWxYMGC+MMf/hDr1q0rcbLTm/KBueqqq2L69Olx/PjxU+4/fvx4XHPNNSVNRab169fHSy+9FHv37i31r4boppkzZ8ZXv/rViIhYunRp7N+/P375y1/Gtm3bSp4sx4EDB+LEiRPxjW98Y+K+sbGx2Lt3b/z617+Odrsd06dPL3HC7rj88svjxhtvjCNHjpQ9ymlN+WswM2fOjKVLl8bu3bsn7hsfH4/du3dX4j3qKimKItavXx87d+6Mv/zlL3H99deXPVJpxsfHo91ulz1GmrvuuisOHz4chw4dmrjdeuutsXr16jh06FAl4hIRcfLkyXjvvffi2muvLXuU05ryZzARERs3bow1a9bErbfeGsuWLYsnn3wyRkdHY+3atWWPlubkyZOn/F/N+++/H4cOHYrZs2fH/PnzS5wsz8DAQGzfvj1efPHF6O3tjWPHjkXE//7lSJdccknJ0+XZtGlTrFq1KubPnx8jIyOxffv22LNnT7z66qtlj5amt7f3c9fWLr300rjyyiun9DW3Rx99NO67775YsGBBHD16NDZv3hzTp0+Phx9+uOzRTq/sb2Prll/96lfF/Pnzi5kzZxbLli0r9u3bV/ZIqf76178WEfG525o1a8oeLc3pjjciimeeeabs0VL98Ic/LBYsWFDMnDmzmDNnTnHXXXcVf/rTn8oeq+uq8G3KDz30UHHttdcWM2fOLL785S8XDz30UHHkyJGyxzojv64fgBRT/hoMAOUQGABSCAwAKQQGgBQCA0AKgQEghcAAkEJgAEghMACkEBgAUggMACkEBoAU/wNYeiRjcC2aMAAAAABJRU5ErkJggg==","text/plain":["
"]},"metadata":{},"output_type":"display_data"},{"data":{"image/png":"iVBORw0KGgoAAAANSUhEUgAAAhYAAACnCAYAAABNThUqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAOeklEQVR4nO3df2hV9R/H8dfd1u6W3B2ass3L7nKFYP5omXOii37gUEQECfoBFkuhP+JazkGkhfpH6U0jEW1oBtk/WfaPWoLBWDoR/JW2SCpNElrJNoW6VxdN2T3fP8J9GWnurvfduR99PuD+sXPv3efd66j3xTnndkK+7/sCAAAwkBf0AAAA4PZBsQAAAGYoFgAAwAzFAgAAmKFYAAAAMxQLAABghmIBAADMFIz0gul0WhcuXFAkElEoFBrp5QEAwDD4vq/Lly8rGo0qL+/mxyVGvFhcuHBBsVhspJcFAAAGOjs7VVlZedPnR7xYRCIRSX8PVlJSMtLLD5nneUGPMCTJZDLoEW7JhSxdyFFyI0sXsL/tuJClCzlKuZ9lKpVSLBYb+By/mREvFtdPf5SUlOR0sXAFGdogxzsL+9sOWdpxJctbXcbAxZsAAMAMxQIAAJihWAAAADMUCwAAYIZiAQAAzFAsAACAGYoFAAAwQ7EAAABmKBYAAMAMxQIAAJihWAAAADMUCwAAYIZiAQAAzFAsAACAmWEVi5aWFo0bN05FRUWaMWOGjh8/bj0XAABwUMbFYteuXWpubtaaNWt06tQp1dTUaO7cuerp6cnGfAAAwCEZF4uNGzfqxRdf1OLFizVx4kRt27ZNd999tz788MNszAcAABySUbG4evWqTp48qYaGhv//grw8NTQ06MiRIzd8T19fn1Kp1KAHAAC4PWVULC5duqT+/n6Vl5cP2l5eXq6urq4bvieRSMjzvIFHLBYb/rQAACCnZf1bIStXrlQymRx4dHZ2ZntJAAAQkIJMXjxmzBjl5+eru7t70Pbu7m5VVFTc8D3hcFjhcHj4EwIAAGdkdMSisLBQ06ZNU1tb28C2dDqttrY2zZw503w4AADgloyOWEhSc3OzGhsbVVtbq7q6Om3atEm9vb1avHhxNuYDAAAOybhYPPPMM7p48aJWr16trq4uPfTQQ/ryyy//cUEnAAC484R83/dHcsFUKiXP85RMJlVSUjKSS2ckFAoFPcKQjPDuGxYXsnQhR8mNLF3A/rbjQpYu5CjlfpZD/fzmXiEAAMAMxQIAAJihWAAAADMUCwAAYIZiAQAAzFAsAACAGYoFAAAwQ7EAAABmKBYAAMAMxQIAAJihWAAAADMUCwAAYIZiAQAAzGR823QrnucFtfSQ5Ppd5q5z4a59LmTpQo6SG1m6gP1tx4UsXchRciPLoeCIBQAAMEOxAAAAZigWAADADMUCAACYoVgAAAAzFAsAAGCGYgEAAMxQLAAAgBmKBQAAMEOxAAAAZigWAADADMUCAACYoVgAAAAzFAsAAGCGYgEAAMxQLAAAgBmKBQAAMJNxsTh06JAWLFigaDSqUCikPXv2ZGEsAADgooyLRW9vr2pqatTS0pKNeQAAgMMKMn3DvHnzNG/evCG/vq+vT319fQM/p1KpTJcEAACOyPo1FolEQp7nDTxisVi2lwQAAAHJerFYuXKlksnkwKOzszPbSwIAgIBkfCokU+FwWOFwONvLAACAHMDXTQEAgBmKBQAAMJPxqZArV67o3LlzAz+fP39eHR0dKi0tVVVVlelwAADALSHf9/1M3nDw4EE98cQT/9je2Niojz766JbvT6VS8jwvkyUDkWEsgQmFQkGPcEsuZOlCjpIbWbqA/W3HhSxdyFFyI0tJSiaTKikpuenzGR+xePzxx53ZSQAAYGRxjQUAADBDsQAAAGYoFgAAwAzFAgAAmKFYAAAAMxQLAABghm
IBAADMUCwAAIAZigUAADBDsQAAAGYoFgAAwAzFAgAAmKFYAAAAMxnf3dTKrW67GjRXbl/rwp1mXcjShRwlN7J0AfvbjgtZupCjlPtZplIpeZ53y9dxxAIAAJihWAAAADMUCwAAYIZiAQAAzFAsAACAGYoFAAAwQ7EAAABmKBYAAMAMxQIAAJihWAAAADMUCwAAYIZiAQAAzFAsAACAGYoFAAAwQ7EAAABmKBYAAMBMRsUikUho+vTpikQiKisr08KFC3XmzJlszQYAAByTUbFob29XPB7X0aNH1draqmvXrmnOnDnq7e3N1nwAAMAhId/3/eG++eLFiyorK1N7e7seffTRIb0nlUrJ8zwlk0mVlJQMd+msC4VCQY8wJP9h940YF7J0IUfJjSxdwP6240KWLuQo5X6WQ/38LvgviySTSUlSaWnpTV/T19envr6+QYMBAIDb07Av3kyn02pqalJ9fb0mT55809clEgl5njfwiMViw10SAADkuGGfCnnppZe0f/9+HT58WJWVlTd93Y2OWMRiMU6FGMn1Q2eSG1m6kKPkRpYuYH/bcSFLF3KUcj/LrJ4KWbp0qfbt26dDhw79a6mQpHA4rHA4PJxlAACAYzIqFr7v6+WXX9bu3bt18OBBVVdXZ2suAADgoIyKRTwe186dO7V3715FIhF1dXVJkjzPU3FxcVYGBAAA7sjoGoubnafasWOHXnjhhSH9Dr5uaivXz8lJbmTpQo6SG1m6gP1tx4UsXchRyv0ss3KNRa7/RwMAgGBxrxAAAGCGYgEAAMxQLAAAgBmKBQAAMEOxAAAAZigWAADADMUCAACYoVgAAAAzFAsAAGCGYgEAAMxQLAAAgBmKBQAAMJPRTcgsXL+RWSqVGumlb0vkaIMc7yzsbztkaSfXs7w+361uSJrRbdMt/Prrr4rFYiO5JAAAMNLZ2anKysqbPj/ixSKdTuvChQuKRCIKhUL/+felUinFYjF1dnb+6/3hcWtkaYcsbZCjHbK0c6dm6fu+Ll++rGg0qry8m19JMeKnQvLy8v616QxXSUnJHbWDs4ks7ZClDXK0Q5Z27sQsPc+75Wu4eBMAAJihWAAAADPOF4twOKw1a9YoHA4HPYrzyNIOWdogRztkaYcs/92IX7wJAABuX84fsQAAALmDYgEAAMxQLAAAgBmKBQAAMEOxAAAAZpwvFi0tLRo3bpyKioo0Y8YMHT9+POiRnJNIJDR9+nRFIhGVlZVp4cKFOnPmTNBjOe/tt99WKBRSU1NT0KM46bffftNzzz2n0aNHq7i4WFOmTNHXX38d9FhO6e/v16pVq1RdXa3i4mLdf//9evPNN295EylIhw4d0oIFCxSNRhUKhbRnz55Bz/u+r9WrV2vs2LEqLi5WQ0ODfvrpp2CGzTFOF4tdu3apublZa9as0alTp1RTU6O5c+eqp6cn6NGc0t7erng8rqNHj6q1tVXXrl3TnDlz1NvbG/Rozjpx4oTef/99Pfjgg0GP4qTff/9d9fX1uuuuu7R//359//33evfdd3XPPfcEPZpT1q9fr61bt+q9997TDz/8oPXr12vDhg3asmVL0KPlvN7eXtXU1KilpeWGz2/YsEGbN2/Wtm3bdOzYMY0aNUpz587VX3/9NcKT5iDfYXV1dX48Hh/4ub+/349Go34ikQhwKvf19PT4kvz29vagR3HS5cuX/fHjx/utra3+Y4895i9btizokZzz2muv+Y888kjQYzhv/vz5/pIlSwZte/LJJ/1FixYFNJGbJPm7d+8e+DmdTvsVFRX+O++8M7Dtjz/+8MPhsP/JJ58EMGFucfaIxdWrV3Xy5Ek1NDQMbMvLy1NDQ4OOHDkS4GTuSyaTkqTS0tKAJ3FTPB7X/PnzB/3ZRGY+//xz1dbW6qmnnlJZWZmmTp2qDz74IOixnDNr1iy1tbXp7NmzkqRvv/1Whw8f1rx58wKezG3nz59XV1fXoL/jnudpxowZfP4ogLubWrl06ZL6+/
tVXl4+aHt5ebl+/PHHgKZyXzqdVlNTk+rr6zV58uSgx3HOp59+qlOnTunEiRNBj+K0n3/+WVu3blVzc7Nef/11nThxQq+88ooKCwvV2NgY9HjOWLFihVKplCZMmKD8/Hz19/dr7dq1WrRoUdCjOa2rq0uSbvj5c/25O5mzxQLZEY/Hdfr0aR0+fDjoUZzT2dmpZcuWqbW1VUVFRUGP47R0Oq3a2lqtW7dOkjR16lSdPn1a27Zto1hk4LPPPtPHH3+snTt3atKkSero6FBTU5Oi0Sg5ImucPRUyZswY5efnq7u7e9D27u5uVVRUBDSV25YuXap9+/bpwIEDqqysDHoc55w8eVI9PT16+OGHVVBQoIKCArW3t2vz5s0qKChQf39/0CM6Y+zYsZo4ceKgbQ888IB++eWXgCZy06uvvqoVK1bo2Wef1ZQpU/T8889r+fLlSiQSQY/mtOufMXz+3JizxaKwsFDTpk1TW1vbwLZ0Oq22tjbNnDkzwMnc4/u+li5dqt27d+urr75SdXV10CM5afbs2fruu+/U0dEx8KitrdWiRYvU0dGh/Pz8oEd0Rn19/T++8nz27Fnde++9AU3kpj///FN5eYP/mc/Pz1c6nQ5oottDdXW1KioqBn3+pFIpHTt2jM8fOX4qpLm5WY2NjaqtrVVdXZ02bdqk3t5eLV68OOjRnBKPx7Vz507t3btXkUhk4Byh53kqLi4OeDp3RCKRf1yXMmrUKI0ePZrrVTK0fPlyzZo1S+vWrdPTTz+t48ePa/v27dq+fXvQozllwYIFWrt2raqqqjRp0iR988032rhxo5YsWRL0aDnvypUrOnfu3MDP58+fV0dHh0pLS1VVVaWmpia99dZbGj9+vKqrq7Vq1SpFo1EtXLgwuKFzRdBfS/mvtmzZ4ldVVfmFhYV+XV2df/To0aBHco6kGz527NgR9GjO4+umw/fFF1/4kydP9sPhsD9hwgR/+/btQY/knFQq5S9btsyvqqryi4qK/Pvuu89/4403/L6+vqBHy3kHDhy44b+LjY2Nvu///ZXTVatW+eXl5X44HPZnz57tnzlzJtihc0TI9/lfsAEAABvOXmMBAAByD8UCAACYoVgAAAAzFAsAAGCGYgEAAMxQLAAAgBmKBQAAMEOxAAAAZigWAADADMUCAACYoVgAAAAz/wNMo/wPr7lKlAAAAABJRU5ErkJggg==","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["# Plot the checkerboard\n","package_name = \"matplotlib\"\n","\n","try:\n"," __import__(package_name)\n"," print(f\"{package_name} is already installed.\")\n","except ImportError:\n"," print(f\"{package_name} not found. Installing...\")\n"," %pip install {package_name}\n","\n","import matplotlib.pyplot as plt\n","plt.imshow(checkerboard, cmap='gray')\n","plt.show()\n","plt.imshow(checkerboard2, cmap='gray');"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["## `NumPy` calculation methods (Reduction)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["An `array` includes several methods that carry out computations based on its contents. **By default, these methods disregard the array's shape and utilize all the elements in the calculations.** For instance, when computing the mean of an array, it sums all of its elements irrespective of its shape, and then divides by the total number of elements. 
**We can also execute these calculations on each dimension.** For example, in a two-dimensional array, we can determine the mean of each row and each column."]},{"cell_type":"code","execution_count":83,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]])"]},"execution_count":83,"metadata":{},"output_type":"execute_result"}],"source":["grades = np.array([[87, 96, 70], [100, 87, 90],\n"," [94, 77, 90], [100, 81, 82]])\n","grades"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can use methods to calculate `sum()`, `min()`, `max()`, `mean()`, `std()` (standard deviation) and `var()` (variance) — each is a functional-style programming reduction:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["1054\n","70\n","100\n","87.83333333333333\n","8.792357792739987\n","77.30555555555556\n"]}],"source":["print(grades.sum())\n","print(grades.min())\n","print(grades.max())\n","print(grades.mean())\n","print(grades.std())\n","print(grades.var())"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Calculations by Row or Column"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Numerous calculation methods can be applied to specific `array` dimensions, referred to as the `array`'s ***axes***. These methods accept an `axis` keyword argument that designates the dimension to be utilized in the calculation, providing a convenient means to perform computations by row or column in a two-dimensional `array`."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Suppose we want to find the maximum grade for each exam, represented by the columns of `grades`. 
By specifying `axis=0`, the calculation is performed on all the row values within each column:"]},{"cell_type":"code","execution_count":85,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]]),\n"," array([100, 96, 90]),\n"," array([1, 0, 1], dtype=int64))"]},"execution_count":85,"metadata":{},"output_type":"execute_result"}],"source":["grades, grades.max(axis=0), grades.argmax(axis=0)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Here, 100 is the maximum value in the first column, and `argmax()` reports its row index, 1. The value 100 also appears in row 3, but when the maximum is duplicated only the index of its first occurrence is reported. 96 and 90 are the maximum values in the second and third columns, respectively."]},{"cell_type":"code","execution_count":86,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]]),\n"," array([95.25, 85.25, 83. ]))"]},"execution_count":86,"metadata":{},"output_type":"execute_result"}],"source":["grades, grades.mean(axis=0)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Hence, 95.25 above represents the average of the first column's grades (87, 100, 94, and 100), 85.25 is the average of the second column's grades (96, 87, 77, and 81), and 83 is the average of the third column's grades (70, 90, 90, and 82). Similarly, specifying `axis=1` performs the calculation on all the column values within each individual row. To determine each student's average grade for all exams, we can use:"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["
"]},{"cell_type":"code","execution_count":87,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[ 87, 96, 70],\n"," [100, 87, 90],\n"," [ 94, 77, 90],\n"," [100, 81, 82]]),\n"," array([84.33333333, 92.33333333, 87. , 87.66666667]))"]},"execution_count":87,"metadata":{},"output_type":"execute_result"}],"source":["grades, grades.mean(axis=1)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["This generates four averages, one for the values in each row. Therefore, 84.33333333 is the average of row 0's grades (87, 96, and 70), and the other averages correspond to the remaining rows. For more methods, refer to [https://numpy.org/doc/stable/reference/arrays.ndarray.html](https://numpy.org/doc/stable/reference/arrays.ndarray.html)."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> For further operations, such as those from linear algebra, we can use the sub-module `numpy.linalg`, which implements basic linear algebra routines such as solving linear systems and singular value decomposition. However, it is not guaranteed to be compiled using efficient routines, and thus we recommend the use of `scipy.linalg`, which we will introduce in a later chapter."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["## `array` Operators"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### The slowness of loops"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The speed of computations on `NumPy` `arrays` can range from very fast to very slow. To optimize performance, the recommended approach is to use ***vectorized operations***, which are typically implemented through `NumPy`'s universal functions (`ufuncs`). In scenarios that involve executing numerous small operations repeatedly, the inherent sluggishness of `Python` often becomes apparent. One such instance is when we loop over `arrays` to perform operations on each element. 
For example, suppose we have an array of values and need to compute the reciprocal of each value. A straightforward approach might involve:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([0.11111111, 0.25 , 0.16666667, 0.14285714, 0.5 ])"]},"metadata":{},"output_type":"display_data"}],"source":["def compute_reciprocals(values):\n","    output = np.empty(len(values))\n","    for i in range(len(values)):\n","        output[i] = 1.0 / values[i]\n","    return output\n","\n","values = np.random.randint(1, 10, 5)\n","compute_reciprocals(values)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["But if we measure the execution time of this code for a large input, we see that this operation is very slow:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["big_array = np.random.randint(1, 10, 1_000_000)"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["1.33 s ± 3.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"]}],"source":["%%timeit \n","compute_reciprocals(big_array)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> Interestingly, the bottleneck in this situation isn't the operations themselves, but rather the type checking and function dispatches that `Python` needs to execute during each iteration of the loop. Whenever the reciprocal is calculated, `Python` first checks the type of the object and performs a dynamic lookup to determine the correct function to use for that type. If we were using compiled code, the type would be known before the code runs, resulting in much more efficient computations."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["In `NumPy`, ***vectorization*** is the process of performing operations on entire `arrays` of data, as opposed to individual elements. 
This is accomplished by applying an operation to the entire `array`, instead of looping through each element of the `array` one at a time."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([0.11111111, 0.25 , 0.16666667, 0.14285714, 0.5 ])"]},"metadata":{},"output_type":"display_data"}],"source":["1.0 / values # The vectorized version of the above code"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The above syntax is the vectorized version of the original code and works because of ***broadcasting***. Looking at the execution time for our big `array`, we see that it completes orders of magnitude faster than the `Python` loop:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["2.04 ms ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"]}],"source":["%%timeit \n","(1.0 / big_array)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Execution is much faster because the vectorized operation is carried out by `ufuncs`, which are compiled routines. Now we will introduce each concept in detail, including broadcasting, `ufuncs` and vectorization."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Element-wise arithmetic"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["`NumPy` offers numerous operators that allow us to write simple expressions that carry out operations on whole arrays and return another `array`. First, let's perform **element-wise arithmetic with arrays and numeric values** by employing arithmetic operators and augmented assignments. Element-wise operations are applied to each element, so the two snippets below double every element and cube every element, respectively. 
Each operation returns a new array containing the result:"]},{"cell_type":"code","execution_count":88,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 2, 4, 6, 8, 10, 12])"]},"execution_count":88,"metadata":{},"output_type":"execute_result"}],"source":["numbers = np.arange(1, 7) # array([1, 2, 3, 4, 5, 6])\n","numbers * 2"]},{"cell_type":"code","execution_count":89,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 1, 8, 27, 64, 125, 216], dtype=int32)"]},"execution_count":89,"metadata":{},"output_type":"execute_result"}],"source":["numbers ** 3"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Augmented assignments modify every element in the left operand in place!"]},{"cell_type":"code","execution_count":90,"metadata":{},"outputs":[{"data":{"text/plain":["array([11, 12, 13, 14, 15, 16])"]},"execution_count":90,"metadata":{},"output_type":"execute_result"}],"source":["numbers += 10\n","numbers"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Broadcasting "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Typically, arithmetic operations necessitate two `arrays` of identical size and shape as operands. When one operand is a single value, known as a scalar, `NumPy` carries out the element-wise calculations **as though the scalar were an array of the same shape as the other operand, but with the scalar value present in all its elements.** This is referred to as ***broadcasting***. The snippets above demonstrate this capability. For instance, `numbers * 2` is equivalent to `numbers * [2, 2, 2, 2, 2, 2]`."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Broadcasting can also be applied between `arrays` of varying sizes and shapes, enabling concise and powerful manipulations. 
We will present more examples of broadcasting later in this chapter when we introduce `NumPy`'s universal functions."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Arithmetic Operations Between `arrays`"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Arithmetic operations and augmented assignments can be performed between arrays of the same shape. Let's multiply the one-dimensional arrays `numbers` and `numbers2` (created below), each containing six elements:"]},{"cell_type":"code","execution_count":91,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 12.1, 26.4, 42.9, 61.6, 82.5, 105.6])"]},"execution_count":91,"metadata":{},"output_type":"execute_result"}],"source":["numbers2 = np.linspace(1.1, 6.6, 6) \n","numbers * numbers2 # array([11, 12, 13, 14, 15, 16]) * array([ 1.1, 2.2, 3.3, 4.4, 5.5, 6.6])"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The outcome is a new `array` created by multiplying the operands element-wise: `11 * 1.1`, `12 * 2.2`, `13 * 3.3`, and so on. Arithmetic operations between `arrays` of integers and floating-point numbers result in an `array` of floating-point numbers. Let's see another example:"]},{"cell_type":"code","execution_count":92,"metadata":{},"outputs":[{"data":{"text/plain":["array([[1., 1., 1.],\n"," [1., 1., 1.],\n"," [1., 1., 1.]])"]},"execution_count":92,"metadata":{},"output_type":"execute_result"}],"source":["c = np.ones((3, 3))\n","c * c "]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Note that the above operation is not matrix multiplication. 
To perform matrix multiplication use the `dot()` method!"]},{"cell_type":"code","execution_count":93,"metadata":{},"outputs":[{"data":{"text/plain":["array([[3., 3., 3.],\n"," [3., 3., 3.],\n"," [3., 3., 3.]])"]},"execution_count":93,"metadata":{},"output_type":"execute_result"}],"source":["c.dot(c)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["The above operation is the same as using the `@` operator:"]},{"cell_type":"code","execution_count":94,"metadata":{},"outputs":[{"data":{"text/plain":["array([[3., 3., 3.],\n"," [3., 3., 3.],\n"," [3., 3., 3.]])"]},"execution_count":94,"metadata":{},"output_type":"execute_result"}],"source":["c @ c"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can apply broadcasting to higher-dimensional `arrays` in a similar way. For instance, consider adding a one-dimensional `array` to a two-dimensional `array` and observe the resulting output:"]},{"cell_type":"code","execution_count":95,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["(3,) (3, 3)\n"]},{"data":{"text/plain":["array([[1., 2., 3.],\n"," [1., 2., 3.],\n"," [1., 2., 3.]])"]},"execution_count":95,"metadata":{},"output_type":"execute_result"}],"source":["a = np.array([0, 1, 2])\n","M = np.ones((3, 3))\n","print(a.shape, M.shape)\n","M + a"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Here, the one-dimensional array `a` is stretched, or broadcast, down the rows (along axis 0) so that it matches the shape of `M`; in effect, `a` is added to every row of `M`."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Rules of Broadcasting"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["In `NumPy`, broadcasting adheres to a strict set of rules that govern how two `arrays` interact with one another. These rules are as follows:\n","\n","1. 
When the two `arrays` differ in their number of dimensions, the shape of the `array` with fewer dimensions is padded with ones on its leading (left) side to match the number of dimensions of the other `array`.\n","2. If the shapes of the two `arrays` do not match in some dimension, the `array` whose size is 1 in that dimension is stretched to match the size of the other `array`.\n","3. If the sizes of the `arrays` disagree in any dimension and neither is equal to 1, an error is raised."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Now let's take a look at an example where both `arrays` need to be broadcast:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["(4, 1) (3,)\n"]},{"data":{"text/plain":["(array([[ 0],\n"," [10],\n"," [20],\n"," [30]]),\n"," array([0, 1, 2]))"]},"metadata":{},"output_type":"display_data"}],"source":["a = np.arange(0, 40, 10).reshape(4,1)\n","b = np.arange(3)\n","print(a.shape, b.shape)\n","a, b"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 0, 1, 2],\n"," [10, 11, 12],\n"," [20, 21, 22],\n"," [30, 31, 32]])"]},"metadata":{},"output_type":"display_data"}],"source":["a + b"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["1. To begin, we need to determine the shapes of the two `arrays`: `a.shape` is `(4,1)` and `b.shape` is `(3,)`. According to Rule 1, we pad the shape of `b` with a one on its left so that it has as many dimensions as `a`. Thus, `b.shape` becomes `(1,3)`.\n","\n","2. Next, Rule 2 states that every dimension of size 1 is stretched to match the corresponding size of the other `array`. The second dimension of `a` is stretched from 1 to 3, so `a.shape` becomes `(4,3)`, and the first dimension of `b` is stretched from 1 to 4, so `b.shape` also becomes `(4,3)`.\n","\n","3. Since the shapes of the two `arrays` now match, they are compatible. 
\n","\n","This entire process can be depicted visually as follows:"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["
"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Next, let's look at an example in which the two `arrays` are incompatible!"]},{"cell_type":"code","execution_count":96,"metadata":{},"outputs":[{"data":{"text/plain":["((3, 2), (3,))"]},"execution_count":96,"metadata":{},"output_type":"execute_result"}],"source":["M = np.ones((3, 2))\n","a = np.arange(3)\n","\n","M.shape, a.shape"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["1. First, we need to determine the shapes of the two `arrays`: `M.shape` is `(3,2)`, and `a.shape` is `(3,)`. As per Rule 1, we pad the shape of `a` with a one on its left so that its number of dimensions matches that of `M`. Consequently, `a.shape` becomes `(1, 3)`, while `M.shape` remains the same.\n","\n","2. Next, Rule 2 requires that we stretch the first dimension of `a` to match that of `M`. Therefore, `a.shape` becomes `(3,3)`, while `M.shape` stays the same.\n","\n","3. However, Rule 3 now comes into play: the final shapes `(3,2)` and `(3,3)` disagree in the second dimension, and neither size is 1. As a result, these two `arrays` are incompatible. 
\n","\n","This incompatibility is evident when we attempt to perform this operation."]},{"cell_type":"code","execution_count":97,"metadata":{},"outputs":[{"ename":"ValueError","evalue":"operands could not be broadcast together with shapes (3,2) (3,) ","output_type":"error","traceback":["\u001b[1;31m---------------------------------------------------------------------------\u001b[0m","\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)","\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_6036\\3374645918.py\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mM\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0ma\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m","\u001b[1;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,2) (3,) "]}],"source":["M + a"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Comparing arrays"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Both individual values and other arrays can be compared in `NumPy`. 
The comparisons are carried out element-wise, producing arrays of Boolean values in which each element is the outcome of the corresponding comparison, either `True` or `False`:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([False, False, True, True, True, True])"]},"metadata":{},"output_type":"display_data"}],"source":["numbers >= 13 # numbers = array([11, 12, 13, 14, 15, 16])"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Note that the above expression implicitly used broadcasting: the scalar `13` is compared with every element of `numbers`."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([ True, True, True, True, True, True])"]},"metadata":{},"output_type":"display_data"}],"source":["numbers2 < numbers # numbers2 = array([ 1.1, 2.2, 3.3, 4.4, 5.5, 6.6])"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([False, False, False, False, False, False])"]},"metadata":{},"output_type":"display_data"}],"source":["numbers == numbers2"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([ True, True, True, True, True, True])"]},"metadata":{},"output_type":"display_data"}],"source":["numbers == numbers"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Universal Functions (Vectorization)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Now we will delve into how `NumPy` performs element-wise operations on `arrays` without an explicit `for` loop. `NumPy` provides many operations as standalone ***universal functions*** (also known as `ufuncs`) that work element-wise, meaning that they apply the same operation to each element in an `array`. These functions operate on one or two `array`-like arguments (such as `lists`). 
Some of these functions are automatically invoked when operators like `+` and `*` are used with `arrays`. By default, each `ufunc` returns a new `array` that contains the results of the operation."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["`NumPy` offers a practical interface for various kinds of operations that directly access statically typed and compiled routines. These operations are called ***vectorized operations***. Vectorization is achieved using `array` operations, such as addition, subtraction, multiplication, and division. It can also be achieved by calling `ufuncs` directly. These vectorized methods are intended to move the loop to the compiled layer that underpins `NumPy`, leading to considerably quicker execution."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can view the complete list of universal functions, with their descriptions and further information, at [https://numpy.org/doc/stable/reference/ufuncs.html](https://numpy.org/doc/stable/reference/ufuncs.html)\n","\n","> See [here](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#problem-vectorization) for more vectorization examples of varying complexity."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Exploring `NumPy`’s `Ufuncs`"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's add two `arrays` with the same shape, using the `add()` universal function:"]},{"cell_type":"code","execution_count":103,"metadata":{},"outputs":[{"data":{"text/plain":["array([21, 32, 43, 54, 65, 76])"]},"execution_count":103,"metadata":{},"output_type":"execute_result"}],"source":["numbers2 = np.arange(1, 7) * 10 # array([10, 20, 30, 40, 50, 60])\n","np.add(numbers, numbers2) # equivalent to numbers + numbers2, numbers = array([11, 12, 13, 14, 15, 16])"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Broadcasting with Universal 
Functions"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's use the `multiply()` universal function to multiply every element of `numbers2` by the scalar value 5:"]},{"cell_type":"code","execution_count":104,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 50, 100, 150, 200, 250, 300])"]},"execution_count":104,"metadata":{},"output_type":"execute_result"}],"source":["np.multiply(numbers2, 5) # equivalent to numbers2 * 5"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Let's reshape `numbers2` into a 2-by-3 array, then multiply its values by a one-dimensional `array` of three elements:"]},{"cell_type":"code","execution_count":105,"metadata":{},"outputs":[{"data":{"text/plain":["(array([[10, 20, 30],\n"," [40, 50, 60]]),\n"," array([2, 4, 6]))"]},"execution_count":105,"metadata":{},"output_type":"execute_result"}],"source":["numbers3 = numbers2.reshape(2, 3)\n","numbers4 = np.array([2, 4, 6])\n","numbers3, numbers4"]},{"cell_type":"code","execution_count":106,"metadata":{},"outputs":[{"data":{"text/plain":["array([[ 20, 80, 180],\n"," [ 80, 200, 360]])"]},"execution_count":106,"metadata":{},"output_type":"execute_result"}],"source":["np.multiply(numbers3, numbers4) # Equivalent to numbers3 * numbers4"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["In this case, `numbers4` has the same length as each row of `numbers3`, allowing `NumPy` to apply the multiplication operation by treating `numbers4` as an `array` with the following values:\n","\n","```python\n","array([[2, 4, 6],\n"," [2, 4, 6]])\n","```\n","\n","If a universal function receives two `arrays` with different shapes that do not support broadcasting, a `ValueError` is raised. Vectorization and `ufunc` functions are closely associated with broadcasting in `NumPy`, as they are frequently employed together to perform element-wise operations on arrays with varying shapes. 
By combining vectorization, `ufunc` functions, and broadcasting, we can effectively execute complex arithmetic operations on `NumPy` `arrays`."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["There are other special mathematical `ufuncs`. Let's create an array and calculate the sine of its values using the `sin()` universal function:"]},{"cell_type":"code","execution_count":107,"metadata":{},"outputs":[{"data":{"text/plain":["array([ 0.84147098, -0.7568025 , 0.41211849, -0.28790332, -0.13235175,\n"," -0.99177885])"]},"execution_count":107,"metadata":{},"output_type":"execute_result"}],"source":["numbers = np.array([1, 4, 9, 16, 25, 36])\n","np.sin(numbers)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["#### Create our own vectorizing functions"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Vectorized operations are often more concise, so it is advisable to avoid element-wise looping over vectors and matrices and to employ vectorized algorithms instead. The first step in converting a scalar algorithm to a vectorized one is to make sure that the functions we write can work with vector inputs. The following scalar implementation, for instance, cannot, because its `if` statement expects a single value:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["def Theta(x, th):\n","    \"\"\"\n","    Scalar implementation of a variant of the Heaviside step function.\n","    \"\"\"\n","    if x >= th:\n","        return 1\n","    else:\n","        return 0"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["We can make it work with `arrays` using the `np.vectorize` function:"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["array([0, 0, 0, 0, 1, 1, 1])"]},"metadata":{},"output_type":"display_data"}],"source":["Theta_vec = np.vectorize(Theta)\n","Theta_vec(np.array([-3,-2,-1,0,1,2,3]), 1)"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> Don’t assume `np.vectorize()` is faster. 
It is provided mainly for convenience and conciseness, not for performance, as described [here](https://numpy.org/doc/stable/reference/generated/numpy.vectorize.html)."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["> ### Exercise 2: Suppose we are dealing with a spreadsheet that records the grades of students. The grades include homework, midterm, and final exam scores as follows:\n","\n","
\n","\n","| Name | HW1 | HW2 | HW3 | HW4 | Midterm | Final |\n","| --- | --- | --- | --- | --- | --- | --- |\n","| Alice | 90 | 80 | 70 | 100 | 90 | 95 |\n","| Bob | 80 | 90 | 100 | 70 | 85 | 80 |\n","| Charlie | 70 | 100 | 90 | 80 | 95 | 90 |\n","| David | 60 | 70 | 80 | 90 | 85 | 100 |\n","| Eve | 50 | 60 | 70 | 80 | 75 | 90 |\n","\n","
\n","\n","We would like to calculate the semester score of each student by the following rules:\n","\n","1. The four homework assignments together account for 20% of the semester score, and each homework has the same weight. The midterm and the final each account for 40%. \n","\n","2. We adjust the scores by adding the same constant to every student's score so that the top performer in the class gets a score of 100.\n","\n","We use a 2D array to model the grades so that each row corresponds to a student's scores. Use the following template to complete the task:\n","\n","```python\n","grades = np.array([[90, 80, 70, 100, 90, 95],\n"," [80, 90, 100, 70, 85, 80],\n"," [70, 100, 90, 80, 95, 90],\n"," [60, 70, 80, 90, 85, 100],\n"," [50, 60, 70, 80, 75, 90]])\n","\n","weights = np.array([])\n","scores\n","```"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Your code here"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### Type casting"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Due to the nature of static typing, the type of a `NumPy` `array` does not change once created. However, we can explicitly convert an `array` of one type to another using the `astype()` method. 
By default, this operation creates a new `array` with the new type. Note that converting floats to integers truncates the fractional part."]},{"cell_type":"code","execution_count":112,"metadata":{},"outputs":[{"data":{"text/plain":["dtype('float64')"]},"execution_count":112,"metadata":{},"output_type":"execute_result"}],"source":["M = np.random.rand(5,5)\n","M.dtype"]},{"cell_type":"code","execution_count":113,"metadata":{},"outputs":[{"data":{"text/plain":["array([[0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0],\n"," [0, 0, 0, 0, 0]], dtype=int64)"]},"execution_count":113,"metadata":{},"output_type":"execute_result"}],"source":["M2 = M.astype(np.int64)\n","M2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["See [https://scipy-lectures.org/intro/numpy/elaborate_arrays.html](https://scipy-lectures.org/intro/numpy/elaborate_arrays.html) for more details."]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["## File I/O"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["`NumPy` has its own binary format (`.npy`), which is specific to `NumPy` but offers efficient I/O. This is useful when storing and reading back `array` data. 
Use the functions `numpy.save()` and `numpy.load()`."]},{"cell_type":"code","execution_count":114,"metadata":{},"outputs":[{"data":{"text/plain":["array([[0.38791019, 0.93061564, 0.33040029, 0.9810319 , 0.92882123],\n"," [0.87579212, 0.57436978, 0.99856211, 0.83380903, 0.34188505],\n"," [0.30067505, 0.99334148, 0.89929337, 0.78057549, 0.10613898],\n"," [0.71493999, 0.02124826, 0.95090779, 0.12435095, 0.55611604],\n"," [0.05511708, 0.7047183 , 0.26478257, 0.21195283, 0.77090752]])"]},"execution_count":114,"metadata":{},"output_type":"execute_result"}],"source":["np.save(\"random-matrix.npy\", M)\n","M2 = np.load(\"random-matrix.npy\")\n","M2"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["In summary:\n","\n","To make code faster using `NumPy`:\n","\n","- Use in-place operations: `a *= 3` instead of `a = 3*a`\n","- Use views instead of copies whenever possible\n","- Use broadcasting to operate on `arrays` of different shapes\n","- Vectorize `for` loops: look for ways to replace explicit `for` loops with operations on `NumPy` `arrays`.\n","\n","The differences between `list` and `array` are summarized as follows:\n","\n","\n","**Python** objects:\n","\n","- `Python` `lists` are very general. They can contain any kind of object and are dynamically typed \n","- However, they do not support mathematical functions such as matrix multiplication. Implementing such functions for `Python` `lists` would not be very efficient because of the dynamic typing\n","\n","**NumPy** provides:\n","\n","- `NumPy` `arrays` are **statically typed** and **homogeneous**. The type of the elements is determined when the `array` is created\n","- Because of the static typing, mathematical functions such as multiplication and addition of `NumPy` arrays can be implemented efficiently in a compiled language (C and Fortran are used). 
Moreover, `NumPy` `arrays` are memory efficient."]}],"metadata":{"colab":{"collapsed_sections":["SwFKFBMwRzoa","n3ezAAdIsgpj","gASYx5-Cxzyg","2kLRVhae3rSa","DPpfFEA07W0A","uloOTsPL9A74","MTJO4K049nuU","3hD9uBt6AW9l","IvviPig3At12","HQK5zGP749m3","7YVyJ-IgBZ7s","L5VlI3VKg_dV","wgqoOGqEiUre","jscaE3K65HP9","UMo588t4rY-O","AGs4jgnSsZa1"],"provenance":[],"toc_visible":true},"kernelspec":{"display_name":"base","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.13"},"vscode":{"interpreter":{"hash":"1561eddc5e0c9c74df968f74d5080d02882967127f956e6e7049c43d2ef42321"}}},"nbformat":4,"nbformat_minor":0}